Computing Semantic Relatedness using Wikipedia Link Structure

Author

  • David Milne
Abstract

This paper describes a new technique for obtaining measures of semantic relatedness. Like other recent approaches, it uses Wikipedia to provide a vast amount of structured world knowledge about the terms of interest. Our system, the Wikipedia Link Vector Model or WLVM, is unique in that it does so using only the hyperlink structure of Wikipedia rather than its full textual content. To evaluate the algorithm, we use a large, widely used test set of manually defined measures of semantic relatedness as our benchmark. This allows direct comparison of our system with other similar techniques.

Similar resources

Measuring of Semantic Relatedness between Words based on Wikipedia Links

A novel technique for measuring semantic relatedness between words based on the link structure of Wikipedia is presented. The method uses only Wikipedia’s link information, which spares researchers burdensome text processing. During relatedness computation, the positive effects of Wikipedia’s two-directional links and four link types are taken into account. Using a widel...

Computing Semantic Relatedness from Human Navigational Paths: A Case Study on Wikipedia

In this article, we present a novel approach for computing semantic relatedness and conduct a large-scale study of it on Wikipedia. Unlike existing semantic analysis methods that utilize Wikipedia’s content or link structure, we propose to use human navigational paths on Wikipedia for this task. We obtain 1.8 million human navigational paths from a semi-controlled navigation experiment – a Wiki...

WikiRelate! Computing Semantic Relatedness Using Wikipedia

Wikipedia provides a knowledge base for computing word relatedness in a more structured fashion than a search engine and with more coverage than WordNet. In this work we present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google ...

WikiWalk: Random walks on Wikipedia for Semantic Relatedness

Computing semantic relatedness of natural language texts is a key component of tasks such as information retrieval and summarization, and often depends on knowledge from a broad range of real-world concepts and relationships. We address this knowledge integration issue with a method of computing semantic relatedness using personalized PageRank (random walks) on a graph derived from Wikipedia. T...

A semantic relatedness metric based on free link structure

While shortest paths in WordNet are known to correlate well with semantic similarity, an is-a hierarchy is less suited for estimating semantic relatedness. We demonstrate this by comparing two free scale networks (ConceptNet and Wikipedia) to WordNet. Using the Finkelstein353 dataset we show that a shortest path metric run on Wikipedia attains a better correlation than WordNet-based metrics. C...

Publication date: 2007